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We report a globally-optimal approach to robotic path planning under uncertainty, based on the theory of quantitative measures 
of formal languages. A significant generalization to the language-measure-theoretic path planning algorithm v* is presented that 
explicitly accounts for average dynamic uncertainties and estimation errors in plan execution. The notion of the navigation automa- 
ton is generalized to include probabilistic uncontrollable transitions, which account for uncertainties by modeling and planning for 
probabilistic deviations from the computed policy in the course of execution. The planning problem is solved by casting it in the 
form of a performance maximization problem for probabilistic finite state automata. In essence we solve the following optimization 
problem: Compute the navigation policy which maximizes the probability of reaching the goal, while simultaneously minimiz- 
ing the probability of hitting an obstacle. Key novelties of the proposed approach include the modeling of uncertainties using the 
concept of uncontrollable transitions, and the solution of the ensuing optimization problem using a highly efficient search-free com- 
binatorial approach to maximize quantitative measures of probabilistic regular languages. Applicability of the algorithm in various 
models of robot navigation has been shown with experimental validation on a two-wheeled mobile robotic platform (SEGWAY 
RMP 200) in a laboratory environment. 
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1. Introduction & Motivation 

The objective of this paper is to report a globally-optimal ap- 
proach to path planning under uncertainty, based on the theory 
of quantitative measures of formal languages. The field of tra- 
jectory and motion planning is enormous, with applications in 
such diverse areas as industrial robots, mobile robot navigation, 
spacecraft reentry, video games and even drug design. Many 
of the basic concepts are presented in [ 1 ] and in recent com- 
prehensive surveys [2]. In the context of planning for mobile 
robots and manipulators much of the literature on path and mo- 
tion planning is concerned with finding collision-free trajecto- 
ries [3]. A great deal of the complexity in these problems arises 
from the topology of the robot's configuration space, called the 
C-Space. Various analytical techniques, such as wavefront ex- 
pansion [J] and cellular decomposition, have been reported in 
recent literature (5D, which partition the C-Space into a finite 
number of regions with the objective of reducing the motion 
planning problem as identification of a sequence of neighboring 
cells between the initial and final (i.e., goal) regions. Graph- 
theoretic search-based techniques have been used somewhat 
successfully in many wheeled ground robot path planning prob- 
lems and have been used for some UAV planning problems, typ- 
ically radar evasion [6]. These approaches typically suffer from 
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complexity issues arising from expensive searches, particularly 
in complicated configuration spaces. To circumvent the com- 
plexity associated with graph-based planning, sampling based 
planning methods J7|] such as probabilistic roadmaps have been 
proposed. However, sampling based approaches are only prob- 
abilistically complete (i.e. if a feasible solution exists it will 
be found, given enough time) but there is no guarantee of find- 
ing a solution within a specified time, and more often than not, 
global route optimality is not guaranteed. Distinct from these 
general approaches, there exist reported techniques that explic- 
itly make use of physical aspects of specific problems for plan- 
ning e.g. use of vertical wind component for generating optimal 
trajectories for UAVs [8], feasible collision-free trajectory gen- 
eration for cable driven platforms (2D, and the recently reported 
approach employing angular processing lioll . 

1.1. Potential Field-based Planning Methodology 

Among reported deterministic approaches, methods based on 
artificial potential fields have been extensively investigated, of- 
ten referred to cumulatively as potential field methods (PFM). 
The idea of imaginary forces acting on a robot were suggested 
by several authors inclding ifTHl and d. In these approaches 
obstacles exert repulsive forces onto the robot, while the tar- 
get applies an attractive force to the robot. The resultant of all 
forces determines the subsequent direction and speed of travel. 
One of the reasons for the popularity of this method is its sim- 
plicity and elegance. Simple PFMs can be implemented quickly 
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and initially provide acceptable results without requiring many 
refinements. IU3I1 has suggested a generalized potential field 
method that combines global and local path planning. Potential 
field based techniques have been also successfully employed 
in multi-robot co-operative planning scenarios Bl4,ll5ll . where 
other techniques prove to be inefficient and impractical. 

While the potential field principle is particularly attractive 
because of its elegance and simplicity, substantial shortcom- 
ings have been identified that are inherent to this principle. The 
interested reader is referred to [16] for a systematic criticism of 
PFM-based planning, where the authors cite the underlying dif- 
ferential equation based analysis as the source of the problems, 
and the fact that it combines the robot and the environment into 
one unified system. Key problems inherent to PFMs, indepen- 
dent of the particular implementation, are: 

1 . Trap situations due to local minima: Perhaps the best- 
known problem with PFMs are possible trap-situations 
1 1 U 1 1711 . which occur when when the robot runs into a 



dead end, due to the existence of a local extrema in the po- 
tential field. Trap-situations can be remedied with heuris- 
tic recovery rules, which are likely to result in non-optimal 
paths. 

2. No passage between closely spaced obstacles: A severe 
problem with PFMs occurs when the robot attempts to 
travel through narrow corridors thereby experiencing re- 
pulsive forces simultaneously from opposite sides, leading 
to wavy trajectories, no passage etc. 

3. Oscillations in the presence of obstacles: Presence of 
high obstacle clutter often leads to unstable motion, due to 
the complexity of the resultant potential. 

4. Effect of past obstacles: Even after the robot has already 
passed an obstacle, the latter keeps affecting the robot mo- 
tion for a significant period of time (until the repulsive po- 
tential dies down). 

These disadvantages become more apparent when the PFM- 
based methods are implemented in high-speed real-time sys- 
tems; simulations and slow speed experiments often conceal 
the issues; probably contributing to the widespread popularity 
of potential planners. 

1.2. The v* Planning Algorithm 

Recently, the authors reported a novel path planning algo- 
rithm v* ll 811 . that models the navigation problem in the frame- 
work of Probabilistic Finite State Automata (PFSA) and com- 
putes optimal plans via optimization of the PFSA from a strictly 
control-theoretic viewpoint, v* uses cellular decomposition of 
the workspace, and assumes that the blocked grid locations can 
be easily estimated, upon which the planner computes an opti- 
mal navigation gradient that is used to obtain the routes. This 
navigation gradient is computed by optimizing the quantitative 
measure of the probabilistic formal language generated by the 
associated navigation automaton. The key advantages can be 
enumerated as: 

1 . v* is fundamentally distinct from a search: The search 
problem is replaced by a sequential solution of sparse lin- 
ear systems. On completion of cellular decomposition, v* 



optimizes the resultant PFSA via a iterative sequence of 
combinatorial operations which elementwise maximizes 
the language measure vector [ 19] [20]. Note that although 
v* involves probabilistic reasoning, the final waypoint se- 
quence obtained is deterministic. 

2. Computational efficiency: The intensive step in v* is a 
special sparse matrix inversion to compute the language 
measure. The time complexity of each iteration step can 
be shown to be linear in problem size implying significant 
numerical advantage over search-based methods for high- 
dimensional problems. 

3. Global monotonicity: The solution iterations are globally 
monotonic, i.e, each iteration yields a better approxima- 
tion to the final optimal solution. The final waypoint se- 
quence is generated essentially by following the measure 
gradient which has a unique maxima at the goal. 

4. Global Optimality: It can be shown that trap-situations 
are a mathematical impossibility for v*. 

The optimal navigation gradient produced by v* is reminiscent 
of potential field methods 0]. However, v* automatically gen- 
erates, and optimizes this gradient; no ad-hoc potential function 
is necessary. 

1.3. Focus of Current Work & Key Contributions 

The key focus of this paper is extension of the v* planning 
algorithm to optimally handle execution uncertainties. It is well 
recognized by domain experts that merely coming up with a 
navigation plan is not sufficient; the computed plan must be 
executed in the real world by the mobile robot, which often 
cannot be done exactly and precisely due to measurement noise 
in the exteroceptive sensors, imperfect actuations, and external 
disturbances. The idea of planning under uncertainties is not 
particularly new, and good surveys of reported methodologies 
exist 12 111 . In chronological order, the main family of reported 
approaches can be enumerated as follows: 



• Pre-image Back-chaining J22, 23, 24)] where the plan is 
synthesized by computing a set of configurations from 
which the robot can possibly reach the goal, and then 
propgating this preimage recursively backward or back- 
chaining, a problem solving approach originally proposed 
in Q. 



Appro ach based on sensory uncertainty fields (SUF) 12i 
HUES computed over the collision-free subset of the 



robot's configuration space, which reflects expected uncer- 
tainty (distribution of possible errors) in the sensed config- 
uration that would be computed by matching the sensory 
data against a known environment model (e.g. landmark 
locations). A planner then makes use of the computed SUF 
to generate paths that minimize expected errors. 



• Sensor-based planning approaches |3(X |3U |32j, which 
consider explicit uncertainty models of various motion 
primitives to compute a feasible robust plan composed 
of sensor-based motion commands in polygonal envi- 
ronments, with significant emphasis on wall-following 
schemes. 
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Information space based approach using the Bellman prin- 
ciple of stochastic dynamic programming [33], which in- 
troduced key concepts such as setting up the problem in a 
probabilistic framework, and demanding that the optimal 
plan maximize the probability of reaching the goal. How- 
ever, the main drawback was the exponential dependence 
on the dimension of the computed information space. 



• The set-membership approach I134tl which performs a lo- 
cal search, trying to deform a path into one that respects 
uncertainty constraints imposed by arbitrarily shaped un- 
certainty sets. Each hard constraint is turned into a soft 
penalty function, and the gradient descent algorithm is em- 
ployed, hoping convergence to an admissible solution. 

• Probabilistic approaches based on disjunctive linear pro- 
gramming B35l 13611 . with emphasis on UAV applications. 
The key limitation is the inability to take into account 
exteroceptive sensors, and also the assumption that dead- 
reckoning is independent of the path executed. Later ex- 
tensions of this approach use particle representations of the 
distributions, implying wider applicability. 



Adaptation of search strategies in extended spaces B371I38 , 
39l 4(J, which consider the classical search problem in 



configuration spaces augmented with uncertainty informa- 
tion. 

• Approach based on Stochastic Motion Roadmaps 
(SRM) 14111 . which combines sampling -based roadmap 
representation of the configuration space, with the theory 
of Markov Decision Processes, to yield models that can be 
subsequently optimized via value-iteration based infinite 
horizon dynamic programming, leading to plans that 
maximize the probability of reaching the goal. 

The current work adds a new member to the family of exist- 
ing approaches to address globally optimal path planning under 
uncertainties. The key novelty of this paper is the association of 
uncertainty with the notion of uncontrollability in a controlled 
system. The navigation automaton introduced in 01811 is aug- 
mented with uncontrollable transitions which essentially cap- 
tures the possibility that the agent may execute actuation se- 
quences (or motion primitives) that are not coincident with the 
planned moves. The planning objective is simple: Maximize 
the probability of reaching the goal, while simultaneously min- 
imizing the probability of hitting any obstacle. Note that, in this 
respect, we are essentially solving the exact same problem in- 
vestigated by [41]. However our solution approach is very dif- 
ferent. Instead of using value iteration based dynamic program- 
ming, we use the theory of language-measure-theoretic opti- 
mization of probabilistic finite state automata 112011 . Unlike the 
SRM approach, the proposed algorithm does not require the use 
of local auxiliary planners, and also needs to make no assump- 
tions on the structure of the configuration space to guarantee it- 
erative convergence. The use of arbitrary penalties for reducing 
the weight on longer paths is also unnecessary, which makes the 
proposed v* under uncertainties completely free from heuris- 
tics. We show that all the key advantages that v* has over the 



state-of-art carries over to this more general case; namely that 
of significantly better computational efficiency, simplicity of 
implementation, and achieving global optimality via monotonic 
sequence of search-free combinatorial iterative improvements, 
with guaranteed polynomial convergence. The proposed ap- 
proach thus solves the inherently non-convex optimization [42] 
by mapping the physical specification to an optimal control 
problem for probabilistic finite state machines (the navigation 
automata), which admits efficient combinatorial solutions via 
the language-measure-theoretic approach. The source of many 
uncertainties, namely modeling uncertainty, disturbances, and 
uncertain localization, is averaged over (or amortized) for ade- 
quate representation in the automaton framework. This may be 
viewed as a source of approximation in the proposed approach; 
however we show in simulation and in actual experimentation 
that the amortization is indeed a good approach to reduce plan- 
ning complexity and results in highly robust planning decisions. 
Thus the modified language-measure-theoretic approach pre- 
sented in this paper, potentially lays the framework for seam- 
less integration of data-driven and physics-based models with 
the high-level decision processes; this is a crucial advantage, 
and goes to address a key issue in autonomous robotics, e.g., in 
a path-planning scenario with mobile robots, the optimal path 
may be very different for different speeds, platform capabilities 
and mission specifications. Previously reported approaches to 
handle these effects using exact differential models of platform 
dynamics results in overtly complex solutions that do not re- 
spond well to modeling uncertainties, and more importantly to 
possibly non-stationary environmental dynamics and evolving 
mission contexts. Thus the measure-theoretic approach enables 
the development of true Cyber-Physical algorithms for control 
of autonomous systems; algorithms that operate in the logical 
domain while optimally integrating, and responding to, physi- 
cal information in the planning process. 



The rest of the paper is organized in seven sections. Sec- 
tion [2] briefly explains the language-theoretic models consid- 
ered in this paper, reviews the language-measure-theoretic op- 
timal control of probabilistic finite state machines and presents 
the necessary details of the reported v* algorithm. Section [3] 
presents the modifications to the navigation model to incor- 
porate the effects of dynamic uncertainties within the frame- 
work of probabilistic automata. Section [4] presents the perti- 
nent theoretical results and establishes the main planning al- 
gorithm. Section [5] develops a formulation to identify the key 
amortized uncertainty parameters of the PFSA-based naviga- 
tion model from an observed dynamical response of a given 
platform. The proposed algorithm is summarized with pertinent 
comments in Section|6] The theoretical development is verified 
in high-fidelity simulations on different navigation models and 
validated in experimental runs on the SEGWAY RMP 200 in 
section|7] The paper is summarized and concluded in Section[8] 
with recommendations for future work. 
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2. Preliminaries: Language Measure-theoretic Optimiza- 
tion Of Probabilistic Automata 

This section summarizes the signed real measure of reg- 
ular languages; the details are reported in II43I1 . Let G, = 
(Q,2>,S,qi,Q m ) be a trim (i.e., accessible and co-accessible) 
finite-state automaton model that represents the discrete-event 
dynamics of a physical plant, where Q = [q% : k e Iq] is the 
set of states and Iq = {1, 2, • ■ ■ , n\ is the index set of states; the 
automaton starts with the initial state qc the alphabet of events 
is E = [o- k : k e J £ ), having £ fl Iq = and J s = {1,2, • • ■ ,(} 
is the index set of events; 5 : Q x S — > 2 is the (possibly partial) 
function of state transitions; and Q m = {q mi ,qm 2 7 ■ " ><?»>;} £ 6 
is the set of marked (i.e., accepted) states with g„ I( = ^ for 
some j € Jg. Let X* be the Kleene closure of E, i.e., the set 
of all finite-length strings made of the events belonging to X 
as well as the empty string e that is viewed as the identity of 
the monoid X* under the operation of string concatenation, i.e., 
es = s — se. The state transition map 5 is recursively extended 
to its reflexive and transitive closure 5 : QxT." — > Q by defining 
Vqj £ g> <%j, e) = qj and e g,treI,ieS*, crs) = 
5(6(qi, cr), s) 

Definition 1. The language L(qi) generated by a DFSA G ini- 
tialized at the state qi e Q is defined as: L(qi) — {s € 
Z* | 6*(qi, s) 6 2} 77ie language L m (qi) marked by the DFSA 
G initialized at the state qi e Q is defined as: L m (qi) — {s € 
I* | 6\qi, s) € Q m ) 

Definition 2. For every q s e Q, let L(qi, qj) denote the set of 
all strings that, starting from the state q u terminate at the state 
q jt i.e., L u = [s e I* | £*(<?;, s) = qj e Q] 

The formal language measure is first defined for terminating 
plants l45ll with sub-stochastic event generation probabili- 
ties i.e. the event generation probabilities at each state summing 
to strictly less than unity. 

Definition 3. The event generation probabilities are specified 
by the function ft : S* x Q — > [0, 1] such that Sqj e Q,Vo~ k e 

(7) Jf(a- t ,9y) = TTji e [0, 1); = 1-6, with e (0, 1); 

(2) n(o~,qj) — if 6(qj,cr) is undefined; n(e,qj) — 1; 

(3) n{cr k s,qj) = n{cr k ,qj) n(s,6(qj, a*)). 

77je « X ^ evmf cosf matrix is defined as: DI|y = jr(g r ,-, cry) 

Definition 4. 77ie sfafe transition probability n : <2 x Q — > 
[0, 1), of the DFSA Gi is defined as follows: Vqi,qj e Q,7Ty = 
^ 7t(o~, q^ The nxn state transition probability ma- 

trix is defined as W\jk — Mqi, qj) 

The set Q m of marked states is partitioned into Qj n and Q m , 
i.e., Q,„ = Q + m U Q~ and Q* n Q" = 0, where Q+ contains 
all gooc/ marked states that we desire to reach, and Q~ contains 
all bad marked states that we want to avoid, although it may 
not always be possible to completely avoid the bad states while 



attempting to reach the good states. To characterize this, each 
marked state is assigned a real value based on the designer's 
perception of its impact on the system performance. 

Definition 5. The characteristic function x '■ Q ~ * l~ L 1] 

that assigns a signed real weight to state-based sublanguages 
L(qi, q) is defined as: 

I [-1,0), qeQ- 
VqeQ, X (q)e\ {0}, q * Q m (1) 
1 (0,U, qeQl 

The state weighting vector, denoted by x — \X\ Xi "' Xn] T , 
where Xj = X(lj) Vj ' e Iq> i s called the x-vector. The j-th ele- 
ment Xj of x-vector is the weight assigned to the corresponding 
terminal state qj. 

In general, the marked language L m {qi) consists of both good 
and bad event strings that, starting from the initial state q,, lead 
to (2,5, an d Q m respectively. Any event string belonging to the 
language L° = L{q,) - L m {qi) leads to one of the non-marked 
states belonging to Q - Q m and L° does not contain any one 
of the good or bad strings. Based on the equivalence classes 
defined in the Myhill-Nerode Theorem, the regular languages 
L(qi) and L m (qi) can be expressed as: D[qi) = Uq k eQ Luk an d 
L m (qd = \Jq t eQ,„ U,k = L m UL m where the sublanguage L a c G ; 
having the initial state is uniquely labelled by the terminal 
state q k , keI Q and L Uj n L uk = V j + k; and L m = Uq k €Q* U.k 
and L~ = \J qk eQ m L^ k are good and bad sublanguages of L m {qi), 
respectively. Then, L° = {j qt<tQm U, k and L(qi) = L° U L+ U L m . 

A signed real measure p! : 2 L( *' — » R = (— oo, +oo) is con- 
structed on the cr-algebra 2 L(9,) for any i e Iq; interested read- 
ers are referred to B43I1 for the details of measure-theoretic def- 
initions and results. With the choice of this cr-algebra, every 
singleton set made of an event string s e L(qi) is a measur- 
able set. By Hahn Decomposition Theorem 14611 . each of these 
measurable sets qualifies itself to have a numerical value based 
on the above state-based decomposition of L(q,) into L°(null), 
L + (positive), and Zr(negative) sublanguages. 

Definition 6. Let u e L(qi, qj) c 2 L(9i) . The signed real mea- 
sure p' of every singleton string set {oj} is defined as: p l ({u>}) := 
rt(qi, u))x{qj). The signed real measure of a sublanguage Ljj C 
L(q t ) is defined as: p uj := p'(L(qi,qj)) = (Stueic,,,^) it(qi, 0))jXj 

Therefore, the signed real measure of the language of a 
DFSA Gj initialized at qj e Q, is defined as p, := p'(L(qi)) = 
Yijei Q ^'(Lij). It is shown in 1430 that the language measure can 
be expressed as p-, = Yijei Q n ijP L j + Xi- Th e language measure 
vector, denoted as // = [p\ p2 • • • is called the //-vector. 

In vector form, we have p = lip +x whose solution is given by 
p = (I - n) ~ l x The inverse exists for terminating plant mod- 
els J44ll because n is a contraction operator [43] due to the strict 
inequality SjHy < !■ The residual = 1 - J^iHy is referred 
to as the termination probability for state € Q. We extend 
the analysis to non-terminating plants with stochastic transition 
probability matrices (i.e. with = 0, V^,- e Q) by renormal- 
izing the language measure ifisill with respect to the uniform 
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termination probability of a limiting terminating model as de- 
scribed next. 

Let II and n be the stochastic event generation and tran- 
sition probability matrices for a non-terminating plant G, = 
(Q, E, 5, qi, Q m ). We consider the terminating plant G,(#) with 
the same DFSA structure {Q,^,6,qi,Q m ) such that the event 
generation probability matrix is given by ( 1 - 9)11 with 9 e (0, 1 ) 
implying that the state transition probability matrix is (1 - 9)11. 

Definition 7 (Renormalized Measure). The renormalized mea- 
sure V g : 2 Uclim — > [-1,1] for the 9-parametrized terminating 
plant Gj(9) is defined as: 



(2) 



The corresponding matrix form is given byvg — 9fi — 9 [I— (1 — 
^n] -1 ^ with 9 e (0, 1). We note that the vector representation 
allows for the following notational simplification v' g (L(qi(9))) — 
V$\. The renormalized measure for the non-terminating plant G, 
is defined to be lime^ + v' g . 

2.1. Event-driven Supervision ofPFSA 

Plant models considered in this paper are deterministic fi- 
nite state automata (plant) with well-defined event occurrence 
probabilities. In other words, the occurrence of events is prob- 
abilistic, but the state at which the plant ends up, given a par- 
ticular event has occurred, is deterministic. Since no emphasis 
is placed on the initial state and marked states are completely 
determined by x, the models can be completely specified by a 
sextuple as: G = (Q, E, 6, XI, x, c &) 

Definition 8 (Control Philosophy). Ifq/ — > q^, and the event cr 

is disabled at state qi, then the supervisory action is to prevent 
the plant from making a transition to the state q^, by forcing it 
to stay at the original state qj. Thus disabling any transition 
a at a given state q results in deletion of the original transition 
and appearance of the self-loop 6(q, cr) — q with the occurrence 
probability of cr from the state q remaining unchanged in the su- 
pervised and unsupervised plants. For a given plant, transitions 
that can be disabled in the sense of Definition\8\are defined to 
be controllable transitions. The set of controllable transitions 
in a plant is denoted c &. Note controllability is state-based. 



Definition 10 (Optimal Supervision Problem). Given a (non- 
terminating) plant G — (Q, S, 5, II, x, c @), the problem is to com- 
pute a supervisor that disables a subset S>* C such that 
V* ^ (Eiementwise) v ' C ^ where v* and are the mea- 

sure vectors of the supervised plants G* and G^ under &i* and 
$ft , respectively. 

Remark 1. The solution to the optimal supervision problem is 
obtained in n2(\ £71/ by designing an optimal policy for a ter- 
' with a sub-stochastic transition probability 
To ensure that the computed 



minating plant [4t 
matrix (1 - 0)11 with 9 6 (0, 1). 
optimal policy coincides with the one for 9-0, the suggested 
algorithm chooses a small value for 9 in each iteration step 
of the design algorithm. However, choosing 9 too small may 
cause numerical problems in convergence. Algorithms reported 
in [2^ ~4^1 computes how small a 9 is actually required, i.e., 
computes the critical lower bound 9 t , thus solving the optimal 
supervision problem for a generic PFSA. It is further shown 
that the solution obtained is optimal and unique and can be 
computed by an effective algorithm. 

Definition 11. Following Remark\l\ we note that algorithms 
reported in %2(i |4^/ compute a lower bound for the criti- 
cal termination probability for each iteration of such that the 
disabling/enabling decisions for the terminating plant coin- 
cide with the given non-terminating model. We define 9 mm 
mi II; 9± where 9\} is the termination probability computed in 
the k th iteration. 

Definition 12. If G and G* are the unsupervised and super- 
vised PFSA respectively then we denote the renormalized mea- 
sure of the terminating plant G*(9 m i n ) as v' # : 2 L **' — > [—1, 1] 
( See Definition 0. Hence, in vector notation we have: v# = 
&minU ~ (1 ~ Ominfi^Y 1 X where n # is the transition probability 
matrix of the supervised plant G* , we note that v# — where 
K is the total number of iterations required for convergence. 

For the sake of completeness, the algorithmic approach is 
shown in Algorithms Q] and [2] 

2.3. Problem Formulation: A PFSA Model of Autonomous 
ition 



2.2. Optimal Supervision Problem: Formulation & Solution 

A supervisor disables a subset of the set ^ of controllable 
transitions and hence there is a bijection between the set of 
all possible supervision policies and the power set 2*. That 
is, there exists 2'*' possible supervisors and each supervisor is 
uniquely identifiable with a subset of ^ and the language mea- 
sure v allows a quantitative comparison of different policies. 

Definition 9. For an unsupervised plant G — ( Q, 2, 6, c €), 

let G 1 and G* be the supervised plants with sets of disabled 
transitions, ffl Q and ffi Q ', respectively, whose mea- 
sures are v' and y*. Then, the supervisor that disables ffi 
is defined to be superior to the supervisor that disables if 

V f ^ (Eiementwise) V* and Strictly Superior if V > (Eiementwise) 
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Figure 1 : (a) shows the vehicle (marked "R") with the obstacle positions shown 
as black squares. The Green4 dot identifies the goal (b) shows the finite state 
representation of the possible one-step moves from the current position, (d) 
shows uncontrollable transitions "u" from states corresponding to blocked grid 
locations to "<?q" 

We consider a 2D workspace for the mobile agents. This 
restriction on workspace dimensionality serves to simplify the 
exposition and can be easily relaxed. To set up the problem, the 
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Algorithm 1: Computation of Optimal Supervisor 
input : P, x, & 

output: Optimal set of disabled transitions S>* 
begin 

Set#°) = 0; /* Initial disabling set */ 

Set =11; /* Initial event prob. matrix */ 
Set flj™ = 0.99, Set k = 1 , Set Terminate = false; 
while (Terminate == false) do 

Compute 0™; /* Algorithm [2] * / 

SetnM = J_fcrf*-i]j 

Compute ; 
for j = 1 to n do 
for i = 1 to n do 

Disable all controllable q/ — > qj s.t. v'. ' < v! ; 

Enable all controllable q t — * qj s.t. v. Si vr ' ; 

Collect all disabled transitions in Sfi ' ; 
if == Srffc-H then 

| Terminate = true; 
else 

L * = 4 + 1 ; 

@* = 3> m ; /* Optimal disabling set */ 

end 



workspace is first discretized into a finite grid and hence the ap- 
proach developed in this paper falls under the generic category 
of discrete planning. The underlying theory does not require 
the grid to be regular; however for the sake of clarity we shall 
present the formulation under the assumption of a regular grid. 
The obstacles are represented as blocked-off grid locations in 



Algorithm 2: Computation of the Critical Lower Bound 

input : P, x 
output: 6+ 
begin 

Set^ = l,Sete curr =0; 
Compute 1 , Mo , Mi, M%; 
for j = 1 to n do 
for i = 1 to n do 

| cl , rr = ^\(^x),-(»X)j\ 
else 

for r = to n do 

if (Mox)i * (M x)j then 

| Break; 
else 

if {M M\x) i * {M M[x) j then 
L Break; 

if r == then 

I _ \{{M o -£P)x) r {{M o -0?)x\ j \ 

I "curr - 8M , ; 

else 

if r >0AND r < n then 

1 _ l(M M l x) r (M M l x) 1 l 

| Vcurt ~ 2H 3 Ml I 

else 

L Qcurt = i ; 

|_ 9* = min(0*, e cun ) ; 

end 



the discretized workspace. We specify a particular location 
as the fixed goal and consider the problem of finding optimal 
and feasible paths from arbitrary initial grid locations in the 
workspace. FigureQla) illustrates the basic problem setup. We 
further assume that at any given time instant the robot occupies 
one particular location (i.e. a particular square in Figure QJ a)). 
As shown in Figure Q] the robot has eight possible moves from 
any interior location. The boundaries are handled by removing 
the moves that take the robot out of the workspace. The possi- 
ble moves are modeled as controllable transitions between grid 
locations since the robot can "choose" to execute a particular 
move from the available set. We note that the number of pos- 
sible moves (8 in this case) depends on the chosen fidelity of 
discretization of the robot motion and also on the intrinsic ve- 
hicle dynamics. The complexity results presented in this paper 
only assumes that the number of available moves is significantly 
smaller compared to the number of grid squares, i.e., the dis- 
cretized position states. Specification of inter-grid transitions 
in this manner allows us to generate a finite state automaton 
(FSA) description of the navigation problem. Each square in 
the discretized workspace is modeled as a FSA state with the 
controllable transitions defining the corresponding state transi- 
tion map. The formal description of the model is as follows: 

Let Gnav = (Q, 2, 6, fl,x) be a Probabilistic Finite State Au- 
tomaton (PFSA). The state set Q consists of states that corre- 
spond to grid locations and one extra state denoted by q Q . The 
necessity of this special state q Q is explained in the sequel. 
The grid squares are numbered in a pre-determined scheme 
such that each qi 6 Q\ {q Q } denotes a specific square in the 
discretized workspace. The particular numbering scheme cho- 
sen is irrelevant. In the absence of dynamic uncertainties and 
state estimation errors, the alphabet contains one uncontrollable 
event i.e. £ = 2c UM su ch that Ec is the set of controllable 
events corresponding to the possible moves of the robot. The 
uncontrollable event u is defined from each of the blocked states 
and leads to q e which is a deadlock state. All other transitions 
(i.e. moves) are removed from the blocked states. Thus, if a 
robot moves into a blocked state, it uncontrollably transitions 
to the deadlock state q e which is physically interpreted to be a 
collision. We further assume that the robot fails to recover from 
collisions which is reflected by making q e a deadlock state. 
We note that q Q does not correspond to any physical grid loca- 
tion. The set of blocked grid locations along with the obstacle 
state q e is denoted as ^Obstacle = Q- Figure [T] illustrates 
the navigation automaton for a nine state discretized workspace 
with two blocked squares. Note that the only outgoing transi- 
tion from the blocked states q\ and q$ is u. Next we augment 
the navigation FSA by specifying event generation probabilities 
defined by the map ft : Q x 2 — > [0, 1] and the characteristic 
state- weight vector specified as x '■ Q ~ * [ — 1> !]■ The character- 
istic state-weight vector [20] assigns scalar weights to the PFSA 
states to capture the desirability of ending up in each state. 



Definition 13. The characteristic weights are specified for the 



6 



navigation automaton as follows: 



Notation 2.1. For notational simplicity, we use 



[ -1 if* = ?e 
Xiqd = I 1 if <?, is the goal 
otherwise 



(3) 



In the absence of dynamic constraints and state estimation 
uncertainties, the robot can "choose" the particular controllable 
transition to execute at any grid location. Hence we assume that 
the probability of generation of controllable events is uniform 
over the set of moves defined at any particular state. 

Definition 14. Since there is no uncontrollable events defined 
at any of the unblocked states and no controllable events defined 
at any of the blocked states, we have the following consistent 
specification of event generation probabilities: Vg; e Q, crj e 



fc(q h a-j) 



No. of controllable events at qj ' ^ ^ 

1, otherwise 



The boundaries are handled by "surrounding" the workspace 
with blocked position states shown as "boundary obstacles" in 
the upper part of Figure Q2 C )- 

Definition 15. The navigation model id defined to have identi- 
cal connectivity as far as controllable transitions are concerned 
implying that every controllable transition or move (i.e. every 
element ofLc) is defined from each of the unblocked states. 

2.4. Decision-theoretic Optimization ofPFSA 

The above-described probabilistic finite state automaton 
(PFSA) based navigation model allows us to compute opti- 
mally feasible path plans via the language-measure-theoretic 
optimization algorithm [20] described in Section [2] Keeping 
in line with nomenclature in the path-planning literature, we 
refer to the language-measure-theoretic algorithm as v* in the 
sequel. For the unsupervised model, the robot is free to exe- 
cute any one of the defined controllable events from any given 
grid location (See Figure[Tfb)). The optimization algorithm se- 
lectively disables controllable transitions to ensure that the for- 
mal measure vector of the navigation automaton is elementwise 
maximized. Physically, this implies that the supervised robot is 
constrained to choose among only the enabled moves at each 
state such that the probability of collision is minimized with the 
probability of reaching the goal simultaneously maximized. Al- 
though v* is based on optimization of probabilistic finite state 
machines, it is shown that an optimal and feasible path plan can 
be obtained that is executable in a purely deterministic sense. 

Let Gnav be the unsupervised navigation automaton and 
^Nav be the optimally supervised PFSA obtained by v* . We 
note that v' # is the renormalized measure of the terminating plant 
G^j AV (0„„„) with substochastic event generation probability ma- 
trix n 8 """ = (1 - 9 m i„)Tl. Denoting the event generating function 
(See Definition O for <G* AV and G* AV (0, m „) as n : Q x E -» Q 
and n e ""'< : Q x X — > Q respectively, we have 



y'#(L(qd) = y#(qd = v#|< 

where v# = 6 min [I - (1 - 6 min )TfY x x 

Definition 16 (v*-path). A v* -path p(qi,qj)from state qi € Q 
to state qj e Q is defined to be an ordered set of PFSA states p = 
{q rn - ■ ■ ,q r J with q Tt 6 Q, V s e {!,■■■ ,M},M < CARD(0 
such that 

q n = qi (5a) 

iru = 1i ( 5b ) 
Vi'Je {!-■■■ ,M), q n *q,j (5c) 
V*G{1,». ,M},Vt£ S, v # (q rt ) ^ y#(q rs ) (5d) 

We reproduce without proof the following key results per- 
taining to v*- planning as reported in 11811 . 

Lemma 1. There exists an enabled sequence of transitions from 
state q,eQ\ ^obstacle to qj 6 Q \ {q Q } in G* Ay if and only if 
there exists a v* -path p(qi, qj) in *E^ AV . 

Proposition 1. For the optimally supervised navigation au- 
tomaton Cj^ av , we have 

V<7«ee\e BSTACLE, LiqdgJ* 

Coroiiary 1. (Obstacle Avoidance:) There exists no v* -path 
from any unblocked state to any blocked state in the optimally 
supervised navigation automaton *G^ AV . 

Proposition 2 (Existence of v* -paths). There exists a v*-path 
Piqii qooM.) f mm an y state qi e Q to the goal ^goal 6 Q if and 
only ifv#(qi) > 0. 



Coroiiary 2. (Absence of Local Maxima:) If there exists a v*- 
pathfrom qi e Q to qj e Q and a v* -path from qi to ^goal then 
there exists a v* -path from q j to ^goal, i-e., 

Vqi,qj 6 Q^PiO^Goal) f\ 3p 2 (qi,qj) => Bpiqj, ?Goal)J 

2.5. Optimal Tradeoff between Computed Path Length & Avail- 
ability Of Alternate Routes 



l TgGoAL) v#(qj) > v#(q k ) 



n"™{q i ,e) = \ 
V 9i 6 2,0"; e 2, n e ""»(q l ,o- j ) = (\-i 



(4a) 

„Wqi,crj) (4b) 



Figure 2: Tradeoff between path-length and robustness under dynamic 
uncertainty: o"2W is the shortest path to ^goal from g,; but the v* plan 
may be criuii due to the availability of larger number of feasible paths 
through q;. 
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Majority of reported path planning algorithms consider min- 
imization of the computed feasible path length as the sole op- 
timization objective. However, the v* algorithm can be shown 
to achieve an optimal trade-off between path lengths and avail- 
ability of feasible alternate routes. If a> is the shortest path to 
goal from state q^, then the shortest path from state q/ (with 
Qi — * Ik) is given by cr 2 w. However, a larger number of feasi- 

CT] 

ble paths may be available from state qj (with q, — > qj) which 
may result in the optimal v* plan to be o-\lq\. Mathematically, 
each feasible path from state qj has a positive measure which 
may sum to be greater than the measure of the single path a> 
from state q^. The condition v#(qj) > v#(qk) would then imply 
that the next state from q, would be computed to be qj and not 
qt- Physically it can be interpreted that the mobile gent is bet- 
ter off going to qj since the goal remains reachable even if one 
or more paths become unavailable. The key results lfl8ll are as 
follows: 

Lemma 2. For the optimally supervised navigation automaton 
G Nav> we have V< 7< 6 Q \ ^Obstacle, 



e L{q t ), v;(M) = 0„ 



1 



TCard(I ( ) 



of)) 



Proposition 3. For q t e Q\ ^obstacle, let qi — > qj -* » 

^Goal be the shortest path to the goal. If there exists qu € Q\ 
^Obstacle with qi — > q k for some cr 2 6 X c such that v#(q k ) > 
v#(qf), then the number of distinct paths to goal from state q^ is 
at least CARD(X C ) + 1. 

The lower bound computed in Proposition [3] is not tight and 
if the alternate paths are longer or if there are multiple 'short- 
est' paths then the number of alternate routes required is sig- 
nificantly higher. Detailed examples can be easily presented to 
illustrate situation where v* opts for a longer but more robust 
plan. 

3. Generalizing The Navigation Automaton To Accommo- 
date Uncertain Execution 

In this paper, we modify the PFSA-based navigation model 
to explicitly reflect uncertainties arising from imperfect local- 
ization and the dynamic response of the platform to navigation 
commands. These effects manifest as uncontrollable transitions 
in the navigation automaton as illustrated in Figure [4] Note, 
while in absence of uncertainties and dynamic effects, one can 
disable transitions perfectly, in the modified model, such dis- 
abling is only partial. Choosing the probabilities of the uncon- 
trollable transitions correctly allows the model to incorporate 
physical movement errors and sensing noise in an amortized 
fashion. 

A sample run with a SEGWAY RMP at NRSL is shown in 
Figure [3] Note that the robot is unable to follow the plan ex- 
actly due to cellular discretization and dynamic effects. Such 
effects can be conceptually modeled by decomposing trajectory 
fragments into sequential combinations of controllable and un- 
controllable inter-cellular moves as illustrated in Figure |4|c). 



We do not need to actually decompose trajectories, it is merely 
a conceptual construct that gives us a theoretical basis for com- 
puting the probabilities of uncontrollable transitions from ob- 
served robot dynamics (as described later in Section [5] and 
therefore incorporate the amortized effect of uncertainties in the 
navigation automaton. 



Experimental Run 
at NRSL with 
SEGWAY RMP 
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12 3 4 

XY Scale in meters | 

Figure 3: Plan execution with SEGWAY RMP at NRSL, Pennstate 

3.1. The Modified Navigation Automaton 

The modified navigation automaton Gj^ 5 = (Q, S, 6, Yl,x) 
is defined similar to the formulation in Section 12.31 with the 
exception that the alphabet S is defined as follows: 



I = Zr U E, 



U M 



(6) 



where T,c is the set of controllable moves from any unblocked 
navigation state (as before), while Sj/c is the set of uncontrol- 
lable transitions that can occur as an effect of the platform dy- 
namics and oather uncertainty effects. We assume that for each 
<x e Sc, we have a corresponding event o~ u in Ej/c> such that 
both <x and <x„ represent the same physical move from a given 
navigation state; but while <x is controllable and may be dis- 
abled, cr u is uncontrollable. Although for 2D circular robots we 
have: CARD(Ec) = CARD(Xf/cX in general, there can exist un- 
controllable moves reflecting estimation errors that cannot be 
realized via a single controllable move. For example, for pla- 
nar rectangular robots with a non-zero minimum turn radius, 
there can be an uncontrollable shift in the heading without any 
change in the xy-positional coordinates, which may reflect er- 
rors in heading estimation, but such a move cannot be executed 
via controllable transitions due to the restriction on the mini- 
mum turn radius. We will discuss these issues in more details 
in the sequel. 
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Uncontrollable 
Move 




Trajectory j 1 
(c) 




m 

J Partially 
[Disabled 



Figure 4: (a) shows available moves from the current state (C) in unsupervised 
navigation automaton, (b) shows the enabled moves in the optimally supervised 
PFSA with no dynamic uncertainty, (c) illustrates the case with dynamic uncer- 
tainty, so that the robot can still uncontrollably (and hence unwillingly) make 
the disabled transitions, albeit with a small probability, i.e., probability of tran- 
sitions e' 2 , e\,e' 4 etc. is small, (d) illustrates the concept of using uncontrollable 
transitions to model dynamical response for a 2D circular robot: Jo is the target 
cell from J-j, while the actual trajectory of the robot (shown in dotted line) ends 
up in 74 . We can model this trajectory fragment as first executing a controllable 
move to Jq and then uncontrollably moving to J4, 



Definition 17. The coefficient of dynamic deviation y(GjJJ^) is 
defined as follows: 



y(G^ v D ) = l-max V n(q h o- u ) 



(7) 



MOD 



Definition 18. The event generation probabilities for G NAV 
defined as follows: V<7; € Q \ {qo,oM_},o~j e 



7r(quo-j) = 



No. of controllable events at q, 

n(qi,o-j), 
1, 



if o-j e S c 
if o-j e T, uc 
otherwise 



and for the goal, we define as before: 

l 



if (Tj € Y.Q 



tt( rr \ — ) No. of controllable events at o, ' J L 

{ 1, otherwise 

Note that we assume there is no uncontrollability at the goal. 
This assumption is made for technical reasons clarified in the 
sequel and also to reflect the fact that once we reach the goal, 
we terminate the mission and hence such effects can be ne- 
glected. 

We note the following: 

• In the idealized case where we assume platform dynam- 
ics is completely absent, we have n(qi, cr u ) = 0, Vg,- e 



Q, Vcr„ e Zf/c implying that y(<G^ v D ) = 1, while in prac- 
tice, we expect y(<G^ v D ) < 1. 

In Definition[17] we allowed for the possibility of n(qi, o~ u ) 
being dependent on the particular navigation states q, e Q. 
A significantly simpler approach would be to redefine the 
probability of the uncontrollable events n{qi,cr u ) as fol- 
lows: 



1 



CARD(0 ^ 



^ 7t(qi,o- u ) 



(8) 



where ttavCck) is me average probability of the uncontrol- 
lable event cr„ being generated. 

The averaging of the probabilities of uncontrollable transitions 
is justified in situations where we can assume that the dynamic 
response of the platform is not dependent on the location of the 
platform in the workspace. In this simplified case, the event 
generation probabilities for Gj^ 1 can be stated as: Vqv e Q \ 
{<7Goal) 



Hqi,o-j) = 



No. of controllable events at 
%Av(o-j), 
1, 



if o-j e £ c 
if o-j e Hue 
otherwise 



The key difficulty is allowing the aforementioned dependence 
on states is not the decision optimization that would follow, but 
the complexity of identifying the probabilities; averaging re- 
sults in significant simplification as shown in the sequel. Thus, 
even if we cannot realistically average out the uncontrollable 
transition probabilities over the entire state space, we could de- 
compose the workspace to identify subregions where such an 
assumption is locally valid. In this paper, we do not address 
formal approaches to such decomposition, and will generally 
assume that the afore-mentioned averaging is valid through- 
out the workspace; the explicit identification of the sub-regions 
is more a matter of implementation specifics, and has little to 
do with the details of the planning algorithm presented here, 
and hence will be discussed elsewhere. In Section [5] we will 
address the computation of the probabilities of uncontrollable 
transitions from observed dynamics. First, we will establish the 
main planning algorithm as a solution to the performance opti- 
mization of the navigation automaton in the next section. 



4. Optimal Planning Via Decision Optimization Under Dy- 
namic Effects 

The modified model can be optimized via the measure- 
theoretic technique in a straightforward manner, using the v*- 
algorithm reported in fllll . The presence of uncontrollable tran- 
sitions in Crj^ 5 poses no problem (as far as the automaton opti- 
mization is concerned), since the underlying measure-theoretic 
optimization is already capable of handling such effects [20]. 
However the presence of uncontrollable transitions weakens 
some of the theoretical results obtained in [18] pertaining to 
navigation, specifically the absence of local maxima. We show 
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that this causes the v* planner to lose some of its crucial advan- 
tages, and therefore must be explicitly addressed via a recursive 
decomposition of the planning problem. 

Proposition 4 (Weaker Version of Proposition |2). There exists 
a v*-path p(qi, ^Goal) from any state q { € Qto the goal <7Goal 6 
Q if v-Aq ,)>(). 

Proof. We note that v#(qi) > implies that there necessarily 
exists at least one string a> of positive measure initiating from 
qi and hence there exists at least one string that terminates on 
<?Goal- The proof then follows from the definition of v* -paths 
(See Definition [16). □ 

Remark 2. Comparing with Proposition [2] we note that the 
only if part of the result is lost in the modified case. 

Remark 3. We note that under the modified model, v#(qi) < 
needs to be interpreted somewhat differently. In absence of 
any dynamic uncertainty, v#(g,) < implies that no path to 
goal exists. However, due to weakening of Proposition\l\(See 
Proposition^, v#(g,) < implies that the measure of the set of 
strings reaching the goal is smaller to that of the set of strings 
hitting an obstacle from the state qi. 

The v* -planning algorithm is based on several abstract con- 
cepts such as the navigation automaton and the formal mea- 
sure of symbolic strings. It is important to realize that in 
spite of the somewhat elaborate framework presented here, v*- 
optimization is free from heuristics, which is often not the case 
with competing approaches. In this light, the next proposition is 
critically important as it elucidates this concrete physical con- 
nection. 

No Connecting Path 



"r^fn" 



Obstacles 



9 > Goal 



Figure 5: Absence of uncontrollable transitions at the goal imply that there is 
no path in the optimally disabled navigation automaton from point A to point 
B (or vice versa), since all controllable transitions will necessarily be disabled 
at the goal. It follows that the stationary probability vector may be different 
depending on whether one starts left or right to the goal. However, note that that 
any two points on the same side have a path (possibly made of uncontrollable 
transitions) between them; implying that the stationary probability vector will 
be identical if either of them is chosen as the start locations. 



Proposition 5. Given that a feasible path exists from the start- 
ing state to the goal, the v* planning algorithm under non- 
trivial dynamic uncertainty (i.e. with y(G^^) < I) maximizes 
the probability Pgoal of reaching the goal while simultaneously 
minimizing the probability pQ of hitting an obstacle. 



Proof. Let p be the stationary probability vector for the 
stochastic transition probability matrix corresponding to the 
navigation automaton GjJ^ 3 , for a starting state from which a 
feasible path to goal exists. (Note that p may depend on the 
starting state; Figure [5] illustrates one such example. However, 
once we fix a particular starting state, the stationary vector p is 
uniquely determined). The selective disabling of controllable 
events modifies the transition matrix and in effect alters p, such 
that p 1 x is maximized |20J], where Af is the characteristic weight 
vector, i.e. ,x t = Recalling that x(9Gom) = hxile) = 

-1 and xiqd = if g, is neither the goal nor the abstract ob- 
stacle state q Q , we conclude that the optimization, in effect, 
maximizes the quantity: 



f = £>GOAL - Pq 



(9) 



Also, note that the optimized navigation automaton has only 
two dump states, namely the goal ^goal an d the abstract ob- 
stacle state <7q. That the goal ^goal is in fact a dump state 
is ensured by not having uncontrollable transitions at the goal 
(See Definition [Toll. Hence we must have 



V<7; 6 Q\ {<7GoAL,<7el> 







implying that 



£>GOAL + PQ = 1 

<A = 2pgoal -1 = 1- 2p Q 



(10) 



(Ha) 
(lib) 



Hence it follows that the optimization maximizes Pgoal an d 
simultaneously minimizes p Q . □ 

Remark 4. It is easy to see that Proposition\5\remains valid if 
X(IGoal) — XGoal > 1. In fact, the result remains valid as long 
as the characteristic weight of the goal is positive and the char- 
acteristic weight of the abstract obstacle state q Q is negative. 

4.1. Recursive Problem Decomposition For Maxima Elimina- 
tion 

Weakening of Proposition Q] (See Proposition!!]! has the cru- 
cial consequence that Corollary [2] is no longer valid. Local 
maxima can occur under the modified model. This is a seri- 
ous problem for autonomous planning and must be remedied. 
The problem becomes critically important when applied to so- 
lution of mazes; larger the number of obstables, higher is the 
chance of ending up in a local maxima. While elimination of 
local maxima is notoriously difficult for potential based plan- 
ning approaches, v* can be modified with ease into a recursive 
scheme that yields maxima-free plans in models with non-zero 
dynamic effects {i.e. with y(Gj^) < 1). 

It will be shown in the sequel that for successful execution of 
the algorithm, we may need to assign a larger than unity char- 
acteristic weight xgoal to the goal </goal- A sufficient lower 
bound for ^goal> with possible dependence on the recursion 
step, is given in Proposition [6] The basic recursion scheme can 
be described as follows (Also see the flowchart illustration in 
Algorithm[3]i: 
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1. In the first step (i.e., at recursion step k - 1) we execute 
v*-optimization on the given navigation automaton Gjjf?° 
and obtain the measure vector v# [tl . 

2. We denote the set of states with strictly positive measure 
as Q/t (k denotes the recursion step), i.e., 



Q k = { qi eQ:v # l %>0} 



(12) 



3. If Qk - Qk-i, the recursion terminates; else we update the 
characteristic weights as follows: 



6 Q k , xiqd = X G 



(13) 



and continue the recursion by going back to the first step 
and incrementing the step number k. 



Algorithm 3: Flowchart for recursive v* -planning 



1. Set: Vq t e Q k 

X(lO = /^GOAL 

2. Eliminate Uncont. 
transitions from 

all q t e Q k 



Problem: 

IP MOD 



Save vector No , „ „ ., Yes 

[*] SO. - <Jk-l '■ 



V# L 



Initialize: 

k = 0, Q = (<?goal) 

Execute v* Set k = k + 1 

Define: 

iQk = {% 6 Q : v/ 1 !, > Oj 

Terminate 



Proposition 6. If 6^}- is the critical termination probability 
(See Deiinition \ll\ for the Y* -optimization in the k' h recursion 
step of Algorithm^ then the following condition 



[k] 



*G 



CARD(Sc) /l 



1 



n[k] 



- - 1 

r 



(14) 



is sufficient to guarantee that the following statements are true: 

1. If there exists a state q- t e Q\ Qkfrom which at least one 
state qi 6 Qk is reachable in one hop, then v#^|,- > 0. 

2. The recursion terminates in at most CARD(0 steps. 

3. For the k recursion step, either Qk ^ Qk-\ or no feasible 
path exists to ^goal from any state qi € Q\ Qk-i- 



( 1 ^m'm ) 




(l-y)(l-fiU.) ( q k . , 



Figure 6: Illustration for Proposition |6] Uncontrollable events and strings are 
shown in dashed. 



Proof. Statement 1: 

We first consider the first recursion step, i.e., the case where 
k — and Qk - {<?goal} (See Algorithm |3). We note that the 
goal <7goal achieves the maximum measure on account of the 
fact that only ^goal nas a positive characteristic weight, i.e., we 
have 



V?/ € Q, v # l11 |goal 



(15) 



It follows that all controllable transitions from the goal will 
be disabled in the optimized navigation automaton obtained at 
the end of the first recursion step (See Definition [8] and Algo- 
rithms Q]&[2]|, which in turn implies that the non-renormalized 
measure of the goal (at the end of the first recursion step) is 
given by^GoALe^-- 

The Hahn Decomposition Theorem 114611 . allows us to write: 

v#| f = v#(L + (qd) + v # (L-(qd) (16) 

where L + (qf), Lr(qi) are the sets of strings initiating from state 
q, that have positive and negative measures respectively. 

Let qi e Q\ {<7goal I such that <7goal is reachable from q, in 
one hop. We note that since it is possible to reach the goal in 
one hop from q„ we have: 

y(l-#mm) *GOAL 



V#(L + ( qi )) ^ 0n. 



(17) 



CARD(I C ) 6 min 

where the first term arises due to renormalization (See Defini- 
tion|7]i, the second term denotes the probability of the transition 
leading to the goal and the third term is the non-renormalized 
measure of the goal itself (as argued above). Since it is obvious 
that the goal achieves the maximum measure, the transition to 
the goal will obviously be enabled in the optimized automaton, 
which justifies the second term. It is clear that there are many 
more strings of positive measure (e.g. arising due to the self 
loops at the state qi that correspond to the disabled controllable 
events that do not transition to the goal from qi) which are not 
considered in the above inequality (which contributes to mak- 
ing the left hand side even larger); therefore guaranteeing the 
correctness of the lower bound stated in Eq.fPTli. 

Next, we compute a lower bound for v#(L~(qi)). To that ef- 
fect, we consider an automaton G' identical to the navigation 
automaton at hand in ever respect, but the fact that the <7goal 
has zero characteristic. We denote the state corresponding to 
q, in this hypothesized automaton as q' r and the set of al states 
in G' as Q'. We claim that, after a measure-theoretic optimiza- 
tion (i.e. after applying Algorithms[T]and[2]i, the measure of q' r 
denoted as v*(q' i ), satisfies: 

v*(«D^-(i-r) (is) 

To prove the claim in Eq. ( fTSI l, we first note that denoting the 
renormalized measure vector for G' before any optimization as 
V , the characteristic vector as^-' and for any termination prob- 
ability 9 e (0, 1), we have: 



Hv'lU = l|0[i-(i-0)n]-Viu 

^ pp-a-fl)!!]- 1 ! 

which follows from the following facts: 



x 1 = 1 (19) 
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1. For all 6 e (0, 1], 0[I - (1 - 0)11] _1 is a row-stochastic 
matrix and therefore has unity infinity norm 1 19] 

2- \\x'\\co = 1, since all entries of x' are except for the state 
corresponding to the obstacle state in the navigation au- 
tomaton, which has a characteristic of - 1 . 

Since the only non-zero characteristic is -1, it follows that no 
state in G' can have a positive measure and we conclude from 
Eq. (QUl that: 



Vq'jCQ', v(^)e[-l,0] 



(20) 



Note that q' t is not blocked itself (since we chose q t such that a 
feasible 1-hop path to the goal exists from qf). Next, we sub- 
ject G' to the measure-theoretic optimization (See AlgorithmsQ] 
&H), which disables all controllable transitions to the blocked 
states. In order to compute a lower bound on the optimized 
measure for the state q' r (denoted by v*{q' j ) ), we consider the 
worst case scenario where all neighboring states that can be 
reached from q'. in single hops are blocked. Denoting the set 
of all such neighboring states of qi by N(q'^), we have: 



n". 



-lx(l-y) 
(21) 



where IT^ is the probability of the uncontrollable transition 
from q' { to the neighboring state q'.. Note that we can write 
Eq. ( f2TT > in the worst case scenario where each state in Niq^) 
is blocked, since all controllable transitions from q' { will be dis- 
abled in the optimized plant under such a scenario, and only the 
uncontrollable transitions will remain enabled; and the proba- 
bilities of all uncontrollable transitions defined at state q, sums 
to 1 — y. It is obvious that the lower bound computed in 
Eq. (|2T1 i also reflects a lower bound for v#(L~{qi)), since addi- 
tion of state(s) with positive characteristic or eliminating obsta- 
cles cannot possibly make strings more negative. Furthermore, 
recalling that the goal <7goal is actually reachable from state q t 
by a single hop, it follows that not all neighbors of q, in the 
navigation automaton are blocked, and hence we have the strict 
inequality: 



v # (zrfe))>-(i-y) 



(22) 



Combining Eqns. (TTTb and (|22| >. we note that the following 
condition is sufficient for guaranteeing v#|, > 0. 



7(1 



CARD(2]c) 



x;tgoal > 1 -y 



(23) 



which after a straightforward calculation yields the bound 
stated in Eq. [141 and the Statement 1 is proved for the first re- 
cursion step, i.e. for Qk = {<7Goal}- 

To extend the argument to later recursion steps of Algo- 
rithmic] i.e., for k > 0, we argue as follows. Let Qk ^ {^goal) 
and we have eliminated all uncontrollable transitions from all 
qj G Qk (as required in Algorithm [3}. Further, let q t e Q\ Qk 
such that it is possible to reach some qj e Q in a single control- 
lable hop, i.e. 



controllable 



qj, qj g Qk 



(24) 




a :<7GoalV 



Figure 7: Illustration for Statement 1 of Proposition [6] Note that even if Qa 
has multiple states, q:. ,q~j 2 , qj } , the measure of any string (say o"cri<X[0"2) from 
qi is the same as if q-, was directly connected to the goal ^qoal with all con- 
trollable events disabled at ^goal- The bottom plate illustrates this by showing 
the hypothetical scenario where g, is connected to </goal by °" and Cl> °"2 are 
controllable events disabled at </goal- Note that for this argument to work, we 
must eliminate uncontrollable transitions from all states in Q*. 



We first claim that 

Vqj€Q k ,q r tQ k , v # w |; > v # M |, 



(25) 



which immediately follows from the fact that the optimal con- 
figuration (of transitions from states in Qk) at the end of the 
v*-optimization at the k th step would be to have all control- 
lable transitions from states qj e Qk enabled if and only if 
the transition goes to some state in Qk, since in that case ev- 
ery string initiating from qj terminates on a state having char- 
acteristic ^goal (since there is no uncontrollability from states 
within Qk by construction), whereas if a transition qj > 

controllable 

q r , where qj 6 Qk, q r t Qk allows strings which end up in zero- 
characteristic states and also (via uncontrollable transitions) on 
negative-characteristic states. 

Eq. d25l l implies that no enabled string exits Qk- It therefore 
follows that every string era) starting from the state q,, with a> e 
X£ and d(qi, cr) e Qk (i.e., cr leads to some state within Qk) 
has exactly the same measure as if qi is directly connected to 
<?Goal an d a ll controllable transitions are disabled at ^goal (See 
Figure|7]for an illustration). This conceptual reduction implies 
that Eq. ( fTTI i is valid when Qk ^ {<?goal I since the lower bound 
for v#(L + (<7,)) can t> e computed exactly as already done for the 
case with Qk = {^goal!- The argument for obtaining the lower 
bound for v#(L~(qi)) is the same as before, thus completing the 
proof for Statement 1 for all recursion steps of Algorithms- 
Statement 2: 

Let Qr C q be the set of states from which a feasible path to the 
goal exists. IfCARD(£>«) = 1, then we must have Qr = {^goal} 
and the recursion terminates in one step. In general, for the k th 
recursion step, let Card(Qa) < CARD(2«)- Since there exists 
at least one state, not in Qk, from which a feasible path to the 
goal exists, it follows that there exists at least one state qj from 
which it is possible to reach a state in Qk in one hop. Using 
Statement 1, we can then conclude: 

Qk + l *Qk^> CARD(£ t+1 ) ^ CARD(ftt) + 1 
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CARD(e* + i) ^k+l 



(26) 



which immediately implies that the recursion must terminate in 
at most CARD(g) steps. 
Statement 3: 

Follows immediately from the argument used for proving State- 
ment 2. □ 

Remark 5. The generality ofEq. (1 14b is remarkable. Note that 
the lower bound is not directly dependent on the exact structure 
of the navigation automaton; what only matters is the number 
of controllable moves available at each state, the coefficient of 
dynamic deviation yCG^Av ) an< ^ ^ e critical termination prob- 
ability 6 m j„. Although the exact automaton structure and the 
probability distribution of the uncontrollable transitions are not 
directly important, their effect enters, in a somewhat non-trivial 
fashion, through the value of the critical termination probabil- 
ity. The reader might want to review Algorithm^ (See also 
1 1^ 2~dil ) which computes the critical termination probability in 
each step of the v* -optimization for a better elucidation of the 
aforementioned connection between the structure of the navi- 
gation automaton and xgoal- 

The dependencies of the acceptable lower bound for ^goal 
with the coefficient of dynamic deviation y(G^?^), as com- 
puted in Proposition [6] is illustrated in Figures [SJa) and (b). 
The key points to be noted are: 

1. Asy(G^ v °) -> + ,^goal -» +°°; which reflects the phys- 
ical fact that if no events are controllable, then we cannot 
optimize the mission plan no matter how large ^goal is 
chosen. 

2. As y(&M°n ) -» 1, ^goal -» 0; which implies that in the 
absence of dynamic effects any positive value of ^goal 
suffices. This reiterates the result obtained with ^-qoal = 1 
in iH. 

3. As the number of available controllable moves increases 
(See Figure |8ja)), we need a larger value of ^goal ! sim- 
ilarly if the critical termination probability <?„„■„ is large, 
then the value of ^goal required is also large (See Fig- 
ureHb)). 

4. The functional relationships in Figures [HJ a) and (b) estab- 
lish the fact that for relatively smaller number of control- 
lable moves, a large value of y(<GjJ^ > ) and a small termi- 
nation probability, a constant value of ^goal = 1 may be 
sufficient. 

4.2. Plan Assembly & Execution Approach 

The plan vectors v# [i] (Say, there are K of them, i.e., k e 
{!,■■■ , K}) obtained via the recursive planning algorithm de- 
scribed above, can be used for subsequent mission execution in 
two rather distinct ways: 

1 . ( The Direct Approach:) 

• At any point during execution, if the current state 
qi € Qk for some k e {1, • ■ • ,K], then use the gra- 
dient defined by the plan vector v# [ ^ to decide on 



< o 

o 
u 



-4 















Card (icb=f^T^^ 










i 



0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 

y(« > 



(a) e mln = 0.01 



5 -A 



„ : &min — 0-1 

e mln VoM- 



0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 

y«W ► 



(b) Card(E c ) = 8 



Figure 8: Variation of the acceptable lower bound lor^Q OAL with 7(GjJ^ > ). (a) 
The set of controllable moves is expanded from CARD(Zc) = 4toCARD(£c) = 
100 while holding 

Qmin — 0.01 (b) The critical termination probability 6 nl j n is 
varied from 0.001 to 0.1 while holding CARD(Sc) = 8. Note the lines are 
almost coincident in this case. 



the next move, i.e., qj is an acceptable next state if 



> v^ k \ and for states qc that can be reached 
from the current state g, via controllable events, we 
havev # M| ; -^ v/%. 

• if Vk € {1, ■ • • , K}, qi i Qk, then terminate operation 
because there is no feasible path to the goal. 

• Note that this entails keeping K vectors in memory. 
2. (The Assembled Plan Approach:) 

• Use vjf'^ , k E { 1 , • ■ • , K] to obtain the assembled plan 
vector v# A following Algorithm [4] which assigns a 
real value v# A |, to each state g, in the workspace. We 
refer to this map as the assembled plan. 

• Make use of the gradient defined by v# A to reach the 
goal, by sequentially moving to states with increas- 
ing values specified by the assembled plan, i.e.,. if 
the current state is g, e Q, then qj is an acceptable 
next state if v# A | ; - > v# A |; and for states qt that can 
be reached from the current state g, via controllable 
events, we have v# A | ; - ^ v# A \f. 
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Algorithm 4: Assembly of Plan Vectors 



input : v t m ,k = 1, • • ■ ,K 

(Plan Vectors) output: v# A (Assembled Plan) 

begin 

Set v # A = 0; 
for k = 1 : K do 
for i e Q do 
v#""''|,- = 0; 
if v # L *- 1J | y > Othen 

j tmp. 



9 
10 
11 
12 
13 
14 
15 
16 

17 end 



/* Zero vector */ 



1, = i; 



else 



if v# M |i > Othen 
I, = v # W|,-: 



tmp 



endfor 



endif 
endif 
endfor 

A 



A , lm P 

v# + v # 



• We show in the sequel that if v# A |, < 0, then no fea- 
sible path exists to the goal. 

Before we can proceed further, we need to formally establish 
some key properties of the assembled plan approach. In partic- 
ular, we have the following proposition: 

Proposition 7. 1 . For a state qj e Q, a feasible path to the 
goal exists from the state q lt if and only i/v# A |; > 0. 

2. The assembled plan v# A is free from local maxima, i.e., if 
there exists a v* -path (w.r.t. to v# A ) from q-, e Q to qj e Q 
and a v* -path from qi to ^goal then there exists a v* -path 
from qj to Q'goal, i.e., 

Vquqj e g^pifegGoAL) /\ 3pi(qi,qj) => 3p(qj, <?Goal)) 

3. If a feasible path to the goal exists from the state qi, then 
the agent can reach the goal optimally by following the 
gradient of v# A , where the optimality is to be understood 
as maximizing the probability of reaching the goal while 
simultaneously minimizing the probability of hitting an ob- 
stacle (i.e. in the sense stated in Proposition]^. 

Proof. Statement 1: 

Let the plan vectors obtained by the recursive procedure stated 
in the previous section be v#^' (Say, there are K of them, i.e., 
k e {1, • ■ • , K}) and further let the current state g, e Qk for 
some k e {1, • ■ • ,K}. We observe that on account of Propo- 
sition |U if k = 1, then v#^|; > is sufficient to guaran- 
tee that there exists a v*-path p{q/, ^goal) w.r.t the plan vec- 
tor v# [1] . We further note that v# [1] |, <= => ( Q\ (See 
Eq. (O), implying that v# [1I |; > is also necessary for the ex- 
istence of p(qi,qGoAi)- Extending this argument, we note that, 
for k > 1, a v*-path p(qj,qj) with qj e Qk-\ exists (with re- 
spect to the plan vector v#^) if and only if V#^ \ > 0. Not- 
ing that v# [k] \i > o v# A |, > 0, (See Algorithm |4]i we con- 
clude that a v*-path p{qj,qj) with qj e (2i_i exists (with re- 
spect to the plan vector v# [i:I ) if and only if v# A |, > 0. Also, 



since qj E 2/t-i A q, E £>a, it follows from Algorithm |U that 
v# A |j ^ 1 + v# A |,- > 0. It follows that the same argument can be 
used recursively to find v*-paths p{qj,q[ { ), ■ ■ ■ ,p(qj, ^goal) if 
and only if v# A |, > 0. 

To complete the proof, we still need to show that if there 
exists a feasible path from a state g, to the goal ^goal. then 
there exists a v*-path p(qj, ^goal)- We argue as follows: Let 

qi = <?ri -* qa -* » ?(•„,_, -> q r „, = <7Goal be a feasible path 

from the state q, to ^goal- Furthermore, assume if possible that 



V/tv# L J l; ^ 



(27) 



i.e., there exists no v*-path from qi to <7goal w.r.t v# A . We 
observe that since it is possible to reach ^goal from q Fm in 
one hop, using Proposition|6]we have: 



> => q rm _ x E Qi 



We further note: 



v# cl] |. 
<0 



> 

[2] 1 



qn„-2 e Qi 

2 > => q n 



(28) 



(29) 
(30) 



Hence, we conclude either q rm 2 £ Q2 or q Tm 2 e Q\, It fol- 
lows by straightforward induction that either q\ e 2m- 1 or 
q\ 6 Qm-2, which contradicts the statement in Eq. t27i . There- 
fore, we conclude that if a feasible path to the goal exists from 
any state q t , then a v*-path p(q t , ^goal) (w.r.t v# A ) exists as 
well. This completes the proof of Statement 1 . 
Statement 2: 

Given states quqj e Q, assume that we have the v* -paths 
Pi(<?i> <?Goal) and p2(qi, qj). We observe that: 

3pi(g,,<?GoAL) => v# A |, > (See Statement 1) (31a) 
3p 2 (<7;, qj) => v# A |j ^ v# A |,- (See Definition[[6]i (31b) 
=> v# A |/ > => 3p(qj, <?goal) (See Statement 1) (31c) 

which proves Statement 2. 
Statement 3: 

Statements 1 and 2 guarantee that if a feasible path to the goal 
exists from a state q, e Q, then an agent can reach the goal 
by following a v*-path (w.r.t v# A ) from q t , i.e., by sequentially 
moving to states which have a better measure as compared to 
the current state. 

We further note that a v*-path a> w.r.t v# A from any state 
qi to <7goal can be represented as a concatenated sequence 
o>iO>2 • • • 0Jr • • • w,„ where u> r is a v*-path from some interme- 
diate state qj e Q s , for some s e {l,--- ,K], to some state 
qi 6 Qs-i- Since the recursive procedure optimizes all such 
intermediate plans, and since the outcome "reached goal from 
q" can be visualized as the intersection of the mutually inde- 
pendent outcomes "reached Q s from g, e Q s -\ \ "reached Q s +\ 
from qj € Q" , ■ ■ ■ , "reached ^goal from qi e Q", the overall 
path must be optimal as well. This completes the proof. □ 

We compute the set of acceptable next states from the fol- 
lowing definition. 
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Definition 19. Given the current state q; e Q, Q next is the set 

of states satisfying the strict inequality: 

Q nexr = {qj e Q : v# A \j > v# A |,} (32) 

We note that Proposition[7]implies that Q nex , is empty if and 
only if the current state is the goal or if no feasible path to the 
goal exists from the current state. 

5. Computation of Amortized Uncertainty Parameters 



2. Localization errors: Estimation errors arising from sensor 
noise, and the limited time available for post-processing 
exteroceptive data for a moving platform. Even if we as- 
sume that the platform is capable of processing sensor data 
to eventually localize perfectly for a static robot, the fact 
that we have to get the estimates while the robot pose is 
changing in real time, implies that the estimates lag the 
actual robot configuration. Thus, this effect cannot be ne- 
glected even for the best case scenario of a 2D robot with 
an accurate global positioning system (unless the platform 
speed is sufficiently small). 
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Figure 9: (a) Model for 2D circular robot(b) Numerical integration technique 
for computing the dynamic parameters for the case of a circular robot modele.g. 
a SEGWAY RMP 200 

Specific numerical values of the uncertainty parameters, i.e. 
the probability of uncontrollable transitions in the navigation 
automaton can be computed from a knowledge of the average 
uncertainty in the robot localization and actuation in the config- 
uration space. For simplicity of exposition, we assume a 2D cir- 
cular robot; however the proposed techniques are applicable to 
more general scenarios. The complexity of this identification is 
related to the dynamic model assumed for the platform (e.g. cir- 
cular robot in a 2D space, rectangular robot with explicit head- 
ing in the configuration etc.), the simplifying assumptions made 
for the possible errors, and the degree of averaging that we are 
willing to make. Uncertainties arise from two key sources: 

1. Actuation errors: Inability of the robot to execute planned 
motion primitives exactly, primarily due to the dynamic 
response of the physical platform. 



In our approach, we do not distinguish between the differ- 
ent sources of uncertainty, and attempt to represent the overall 
amortized effect as uncontrollability in the navigation automa- 
ton. The rationale for this approach is straightforward: we visu- 
alize actuation errors as the uncontrollable execution of transi- 
tions before the controllable planned move can be executed, and 
for localization errors, we assume that any controllable planned 
move is followed by an uncontrollable transition to the actual 
configuration. Smaller is the probability of the uncontrollable 
transitions in the navigation automaton, i.e., larger is the coef- 
ficient of dynamic deviation for each state, smaller is the un- 
certainty in navigation. From a history of observed dynamics 
or from prior knowledge, one can compute the distribution of 
the robot pose around the estimated configuration (in an amor- 
tized sense). Then the probability of uncontrollable transitions 
can be estimated by computing the probabilities with which the 
robot must move to the neighboring cells to approximate this 
distribution. The situation for a 2D circular robot is illustrated 
in Figure |9|a), where we assume that averaging over the ob- 
servations lead to a distribution with zero mean-error; i.e., the 
distribution is centered around the estimated location in the con- 
figuration space. For more complex scenarios (as we show in 
the simulated examples), this assumption can be easily relaxed. 
We call this distribution the deviation contour (ID) in the sequel. 
The amortization or averaging is involved purely in estimating 
the deviation contour from observed dynamics (or from prior 
knowledge); a simple methodology for which will be presented 
in the sequel. However, we first formalize the computation of 
the uncertainty parameters from a knowledge of the deviation 
contour. 

For that purpose, we consider the current state in the naviga- 
tion automaton to be q/. Recall that qi maps to a set of possible 
configurations in the workspace. For a 2D circular robot, g, cor- 
responds to a set of x - y coordinates that the robot can occupy, 
while for a rectangular robot, qi maps to a set of (x, y, 6) coor- 
dinates. The footprint of the navigation automaton states in the 
configuration space can be specified via the map f : Q — » 2 C , 
where C is the configuration space of the robot. In general, for 
a given current state qi, we can identify the set N{qi) c Q of 
neighboring states that the robot can transition to in one move. 
The current state q, is also included in N{qi) for notational sim- 
plicity. In case of the 2D circular robot model considered in 
this paper, the cardinality of Niqi) is 8 (provided of course that 
q t is not blocked and is not a boundary state). For a position 
s € %(qi) of the robot, we denote a neighborhood of radius r of 
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the position s in the configuration space as S,, r . The normalized 
"volume" intersections of S s r with the footprints of the states 
included in N{qi) in the configuration space can be expressed 
as : 



L dx 

F J (s,r) = - r ^ ,S qj eN( qi ) 



(33) 



dx 



where Aj = S s , r £(<lj) and dx is the appropriate Lebesgue 
measure for the continuous configuration space. 

We observe that the expected or the average probability of 
the robot deviating to a neighboring state qj e N{qi) from a 
location s e £(g,) is given by: 



Jo 



Fj(s, r)Bdr 



(34) 



Hence, the probability of uncontrollably transitioning to a 
neighboring state qj from the current state q, is obtained by 
considering the integrated effect of all possible positions of the 
robot within %{q,), i.e. we have: 



/ r r > 

J Kg,) JO 



(s, r)Ddrds 



V f f Fj(s,r)Hdrds 



(35) 



where dr, ds are appropriate Lebesgue measures on the contin- 
uous configuration space of the robot. It is important to note 
that the above formulation is completely general and makes no 
assumption on the structure of the configuration space, e.g., the 
calculations can be carried out for 2D circular robots, rectan- 
gular robots or platforms with more complex kinematic con- 
straints equally well. Figures [T3l a)-(c) illustrate the computa- 
tion for a circular robot with eight controllable moves, .e.g., the 
situation for a SEGWAY RMP. The 2D circular case is however 
the simplest, where any state that can be reached by an uncon- 
trollable transition, can also be reached by a controllable move. 
For more complex scenarios, this may not be the case. For ex- 
ample, in the rectangular model, with constraints on minimum 
turn radius, the robot may not be able to move via a controllable 
transition from (x,y, h\) to (x,y, h.%), where h,,i - 1,2 is the 
heading in the initial and final configurations. However, there 
most likely will be an uncontrollable transition that causes this 
change, reflecting uncertainty in the heading estimation (See 
Figure [TOt. Also, one can reduce the averaging effect by con- 
sidering more complex navigation automata. For example, for 
a 2D circular robot, the configuration state can be defined to be 
{xp r eviou S ,y previous, x currem ,y cumnl ), i.e. essentially considering a 
4D problem. The identification of the uncertainty parameters 
on such a model will capture the differences in the uncontrol- 
lable transition probabilities arising from arriving at a given 
state from different directions. While the 2D model averages 
out the differences, the 4D model will make it explicit in the 
specification of the navigation automaton (See Figure [151). In 
the sequel, we will present comparative simulation results for 



these models. Note, in absence of uncertainty, the 4D imple- 
mentation is superfluous; (x prev i OUS ,y prev i OUS , x current ,y current ) has 
no more information than (x currenr ,y currmt ) in that case. 
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Figure 10: A rectangular robot unable to execute zero-radius turns. There exists 
uncontrollable transitions that alter heading in place, which reflects uncertain- 
ties in heading estimation, although there are no controllable moves that can 
achieve this transition 

Next, we present a methodology for computing the relevant 
uncertainty parameters as a function of the robot dynamics. We 
assume a modular plan execution framework, in which the low- 
level continuous controller on-board the robotic platform is se- 
quentially given a target cell (neighboring to the current cell) to 
go to, as it executes the plan. The robot may be able to reach the 
cell and subsequently receives the next target, or may end up in 
a different cell due to dynamic constraints, when it receives the 
next target from this deviated cell as dictated by the computed 
plan. The inherent dynamical response of the particular robot 
determines how well the patform is able to stick to the plan. We 
formulate a framework to compute the probabilities of uncon- 
trollable transitions that best describe these deviations. 

Definition 20. The raw deviation A«(f) as a function of the op- 
eration time t is defined as follows: 



A«(O = 0(p(O,f(O) 



(36) 



where p(t) is the current location of the robot in the workspace 
coordinates, £(t) is the location of the point within the current 
target cell which is nearest to the robot position p(t) ( See Fig- 
ure \ll\ . and ©(■, •) is an appropriate distance metric in the con- 
figuration space. 

The robot will obviously take some time to reach the target 
cell, assuming it is actually able to do so. We wish to eliminate 
the effect of this delay from our calculations, since a platform 
that is able to sequentially reach each target cell, albeit with 
some delay, does not need the plan to be modified. Further- 
more, unless velocity states are incorporated in the navigation 
automata, the plan cannot be improved for reducing this delay. 
We note that the raw deviation A«(f) incorporates the effect of 
this possibly variable delay and needs to be corrected for. We 
do so by introducing the delay corrected deviation A(f) as fol- 
lows: 

Definition 21. The delayed deviation Ad(t, if) is defined as: 

A d (t,Tf) = ®(p(t + r,(t)),at)) (37) 
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where 77(f) is some delay function satisfying Vr € R, tj(t) > 0. 

Definition 22. The delay corrected deviation Ait) as a function 
of the operation time t is defined as: 

V?eR, Aft)= arginf \\Ki(t,T])\\ (38) 

Ad(t, 77) : Vt E R, t/(t) > 

Note that Definition [22] incorporates the possibility that the 
delay may vary in the course of mission execution. We will 
make the assumption that although the delay may vary, it does 
so slowly enough to be approximated as a constant function 
over relatively short intervals. 

If we further assume that we can make observations only at 
discrete intervals, we can approximately compute Ait) over a 
short interval I = [?,-„,>, tfi m i] as follows: 

Vf£/,A(f) = argmin \\®(p(t + 77), £(r))|| (39) 

Furthermore, the approximately constant average delay 77* over 
the interval I can be expressed as: 



Since the delay may vary slowly, the computed value of 77* may 
vary from one observation interval to another. For each interval 
h £ ■ ■ ■ ,Im), one can obtain the approximate probability 
distribution of the delay corrected deviation Ait), which is de- 
noted as ID 11 '. Therefore, from a computational point of view, 
D [A] is just a histogram constructed from the Ait) values for the 
interval 4 (for a set of appropriately chosen histogram bins or 
intervals). For a sufficiently large number of observation in- 
tervals {7i, •• • ,Im], one can capture the deviation dynamics of 
the robotic platform by computing the expected distribution of 
A(f), i.e. computing E), which can be estimated simply by: 



1 M 

d = — y d w 

k=\ 



(41) 



77* =argmin\\®ipit + T]),£it))\\ 

?;eNu(0) 



(40) 



Once the distribution for the delay corrected deviation has been 
computed, we can proceed to estimate the probabilities of the 
uncontrollable transitions, as described before. Determination 
of the uncertainty parameters in the navigation model then al- 
lows us to use the proposed optimization to compute optimal 
plans which the robot can execute. We summarize the sequen- 
tial steps in the next section. 

6. Summarizing v* Planning & Subsequent Execution 

The complete approach is summarized in Algorithm [5] The 
planning and plan assembly steps (Lines 2 & 3) are to be done 
either offline or from time to time when deemed necessary in 
the course of mission execution. Replanning may be necessary 
if the dynamic model changes either due to change in the en- 
vironment or due to variation in the operational parameters of 
the robot itself, e.g., unforeseen faults that alter or restrict the 
kinematic or dynamic degrees of freedom. Onwards from Line 
4 in Algorithm [5] is the set of instructions needed for mission 
execution. Line 5 computes the set of states to which the robot 
can possibly move from the current state. We select one state 
from the set of possible next states which have a strictly higher 
measure compared to the current state in the computed plan. 
It is possible that the set of such states Q„ ext (See Line 6) has 
more than one entry. Choice of any element in Q„ ex , as the next 
desired state is optimal in the sense of Proposition[5]. However, 
one may use some additional criteria to aid this choice to yield 
plans suited to the application at hand, e.g., to minimize change 
of current direction of travel. For example one may choose the 
state from Q nex , that requires least deviation from the current 
direction of movement, to minimize control effort. In general, 
we can penalize turning using a specified penalty /3 £ [0, 1 ] as 
follows: 

Definition 23. Given a turning penalty ft e [0, 1], the turn pe- 
nalized measure values on the set of possible next states Q„ ex t 
is computed as follows: 

e Q next , Aq) = (1 ~ /3)v#Hq) +j3cosihiq)) (42) 

where hiq) is the heading correction required for transitioning 
to q £ Q nex t, which for 2D circular robots is calculated as the 
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Figure 12: Effect of increasing turning penalty /? on making turns in choice of local transitions post v* -optimization. Note that the potential gradient is identical 
in all four cases. The start state is (8, 15) as shown. The gradient shown (by the arrows) is for the computed measure vector v# A , and hence is identical for all four 

cases 



angular correction between the line joining the center of the 
current state to that of q, and the line joining the center of the 
last state with that of the current one. the direction of the last 
state. The robot then chooses q ne xt as the state q e Q nex t which 
has the maximum value for v^(q). 

The effect of penalizing turns is shown in the Figure[T2] Note 
that for maximum turn penalty, the computed plan is almost 
completely free from kinks. Also, note that the v* optimization 
ensures that all these plans have the same probability of success 
and collision. 

As stated in Line 8, the robot may not be successful to actu- 
ally transition to the next chosen state due to dynamic effects. 
In particular, if the state that the robot actually ends up in has 



a non-positive measure (due to uncertainties and execution er- 
rors), then execution is terminated and the mission is declared 
unfeasible from that point (See Line 14). 

It is important to note that if a particular configuration maps 
to a navigation state with non-positive measure, then no feasi- 
ble path to goal exists from that configuration, irrespective of 
uncertainty effects. This underscores the property of the pro- 
posed algorithm that it finds optimal feasible paths; even if the 
only feasible path is very unsafe, it still is the only feasible path; 
and is therefore the optimal course of action (See Proposition|7] 
Statement 3). 



7. Verification & Validation 



Algorithm 5: Summarized Planning & Mission Execution 



input : 
begin 

/* 



Model G*™ 



9 
10 
11 
12 
13 
14 
15 
16 
17 
18 end 



Planning & Plan Assembly — > 

Compute decomposed plans ; /* Algorithm [3l 
Compute assembled plan v# A ; /* Algorithm \4\ 
I * < Mission Execution — > 
while true do 

Find set of neighbors: N(qt) = {q e Q : 3cr e Sc s.t. qi q] 
Compute Q„ ex , = {qj e N(qt) : e N(g t ), v# A |j > v # A | t ); 
Choose one state q„ exl from set Q next \ 
Attempt to move to q mxt \ I * May be unsuccessful 
*/ 

Read current state q\\ I * Possibly q, • + q next 
if q t == <7 G0AL then 

| Mission Successful Terminate Loop; 
else 

if v# A | ; <; Othen 

| Mission Failed Terminate Loop; 
endif 
endif 
endw 



*/; 

*/ 
*/ 
*/; 



In this section we validate the proposed planning algorithm 
via detailed high fidelity simulation studies and in experimen- 
tal runs on a heavily instrumented SEGWAY RMP 200 at the 
robotic testbed at the Networked Robotics & Systems Labora- 
tory (NRSL), Pennstate. The results of these experiments ad- 
equately verify the theoretical formulations and the key claims 
made in the preceding sections. 

Remark 6. In depicting simulation results in the sequel, we re- 
fer to "computed paths/plans ". It is important to clarify, what 
we mean by such a computed or simulated path. The computed 
path is the sequence of configuration states that the robot would 
enter, if the uncertainties do not force it to deviate, i.e., the 
path depicts the best case scenario under the uncertainty model. 
Thus, the depictions merely give us a feel for the kind of paths 
the robot would take; in actual implementation, the trajectories 
would differ between runs. Also, when we refer to lengths of the 
computed paths, we are referring to the lengths of the paths in 
the best case scenario, i.e., the tight lower bound on the path 
length that will actually be encountered. 
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Figure 13: Computation of dynamic uncertainty parameters from observed dynamics for SEGWAY RMP 200 at NRSL, Pennstate: (a) shows the observed distribu- 
tion D of the deviation A for different runs, (b) shows the distribution of the delay rj* for the various runs, Lower plate in (c) illustrates the expected distribution E(D) 
for deviation A while the upper plate in (c) enumerates the probabilities of the uncontrollable transitions to the neighboring cells, and the coefficient of dynamic 
deviation y(G^ D ) 
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Figure 14: Simulation results with a circular robot model and different values of the coefficient of dynamic deviation y(Gj^°). Note the response of the navigation 
Nav 



gradient as yCGjf??) is decreased 



7.1. Simulation Results for Circular Robots 

The recursive version of the modified v* planning algorithm 
presented in this paper (See Algorithm [5} is first validated in 
a detailed simulation example as illustrated in Figures [T3t a-c) 
and rnT a-c). The workspace is chosen to be a 50 x 30 grid, with 
obstacles placed at shaded locations, as illustrated. The size of 
the workspace is chosen to correspond with the size of the ac- 
tual test-bed, where experimental runs on the robotic platform 
would be performed subsequently. Plates (a)-(c) in FigurefPflil- 
lustrates the gradient of the optimized measure vector (by short 
arrows) and a sample optimal path from location (15, 10) (up- 



per,left) to the goal (40, 20) (down,right). We note that the "po- 
tential field" defined by the measure gradient converges (i.e. has 
an unique sink) at the goal. Also, note that the coefficient of dy- 
namic deviation y(G|J^ ) ) is decreased, the algorithm responds 
by altering the optimal route to the goal. In particular, the opti- 
mal path for smaller values of y(G/^9P) stay further away from 
the obstacles. The key point to note here is that the the pro- 
posed algorithm guarantees that this lengthening of the route to 
account for dynamic uncertainty is optimal, i.e., further length- 
ening by staying even further from the obstacles yields no ad- 
vantage in a probabilistic sense. This point has a direct practical 
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implication; one that can be verified experimentally as follows. 
Let us assume that we have a real-world robot equipped with 
on-board reactive collision avoidance, by which we can ensure 
that the platform does not actually collide with obstacles under 
dynamic uncertainty, but executes corrections dictated by local 
reactive avoidance. Then, the preceding result would imply that 
a using the correct value of dynamic deviation (for the specific 
platform) in the planning algorithm would result in routes that 
require the least number of local corrections; which in turn en- 
sures minimum time route traversals on the average. 

It is important to note that the assumption of a circular robot 
poses no critical restrictions. Similar results can be obtained 
for more complex models as well, i.e. rectangular platform with 
constrained turn radius. However, extension to multi-body mo- 
tion planning would require addressing the algorithmic com- 
plexity issues that become important even for the recursive v* 
for very large configuration spaces, and is a topic of future 
work. 

7.2. Simulation Results for Non-symmetric Uncertainty 

As stated in the course of the theoretical development, it is 
possible to choose the degree of amortization or averaging that 
one is willing to allow in the specification of the navigation au- 
tomaton. As a specific example, one may choose to compute the 
probabilities of uncontrollable transitions with respect to some 
length of trajectory history; the simplest case is using the pre- 
vious state information to yield non-symmetric deviation con- 
tours (See Figure \T5[. The particular type of deviation contours 
illustrated in Figure Q3] is obtained if the platform has a large 
stopping distance and inertia, and the heading and positional 
estimates are more or less accurate, i.e., the uncontrollability in 
the model is a stronger function of the dynamic response, rather 
than the estimation errors. A typical scenario is the SEGWAY 
RMP 200 with good global positioning capability, in which fac- 
toring in the dynamic response is important due to the inverted- 
pendulum two-wheel kinematics. For this simulation, we use a 
navigation automaton obtained from discretizing an essentially 
4D underlying continuous configuration space. Each state (ex- 
cept the obstacle state) in the navigation automaton maps to a 
discretized pair of locations, reflecting the current robot loca- 
tion and the one from which it transitioned to the current loca- 
tion. We call this the 4D Model to distinguish it from the the 
significantly smaller and simpler 2D model. Note that the 2D 
model can be obtained from the 4D model by merging states 
with the same current location via averaging the probabilities 
of uncontrollable transitions over all possible previous states. 
Also note that (as stated earlier), in the absence of uncertainty, 
the 4D formulation adds nothing new; explicitly encoding the 
previous location in the automaton state gives us no new infor- 
mation. Table [T]enumerates the comparative model sizes. 

Table 1 : Comparison of 4D and 2D Models 
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Figure 15: Non-symmetric deviation contours arising from explicit dependence 
on the last discrete position for the robot 
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Figure 16: Comparative pathlengths for the chosen initial and goal configura- 
tions: 65 steps for 2D model, 51 steps for 4D model, 36 steps for the case with 
no uncertainty 



Comparisons of computed plans for a particular set of initial 
and goal configurations is shown in Figure [16] To accentuate 
the differences in the computed plans, the deviation contours 
were chosen so that the probability of uncontrollable transition 
in the current direction of travel is significantly more compared 
to that of deviating to left or right, i.e., the contours are re- 
ally narrow ellipses. Under such a scenario, the platform is 
more capable of navigating narrow corridors as compared to the 
amortized 2D counterpart. This is reflected in the paths shown 
in Figure [16] where the path for the 4D model is shorter, and 
goes through some of the narrow bottlenecks, while the path 
for the 2D model takes a safer path. Note, the path for the no- 
uncertainty case is even more aggressive, and shorter. In prac- 
tical implementation, when the uncontrollable probabilities are 
identified from observed dynamics or pre-existing continuous 
models, the differences in the two cases are often significantly 
less. 

7.3. Simulation Results for Rectangular Robots 

The proposed planning algorithm is next applied to the case 
of rectangular robots, specifically ones that have a minimum 
non-zero turn radius. We further impose the constraint that the 
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platform cannot travel backwards, which is a good assumption 
for robots that have no ranging devices in the rear, and also for 
aerial vehicles (UAVs). Even assuming planar operation, this 
problem is essentially 3D, with the navigation automaton re- 
flecting the underlying configuration states of the form (x, y, h) 
where h is the current heading, which can no longer be ne- 
glected due to the inability of the patform to turn in place. A 
visual comparison of the models for the circular and rectangu- 
lar cases is shown in Figure [T7|e). The heading is discretized 
at 15° intervals, implying we have 24 discrete headings. This 
also means that for the same planar workspace, the number of 
states in the rectangular model is about 24 times larger the num- 
ber of states for the circular model. Also, while in the circular 
case, we had 8 neighbors, the number of neighboring configura- 
tions increases to 8 x 24 + 24 = 216 However, not all neighbors 
can be reached via controllable transitions due to the restriction 
on the turn radius; we assume a maximum turn of +45° in the 
model considered for the simulation. As explained earlier, all 
the neighbors may be reachable via uncontrollable transitions, 
which reflects uncertainties in estimation (See Figure ITOb. 

We test the algorithm with different values of y(&^?®) as il- 
lustrated in Figure fTTla-d). Note the trajectories become more 
rounded and less aggressive (as expected) as the uncertainty is 
increased. Also note that the heading at the goal is different for 
the different cases. This is because, in the model , we specified 
as goals any state that maps to the goal location in the planar 
grid irrespective of the heading, i.e., the problem was solved 
with essentially 24 goals. Although for simplicity, the theo- 
retical development was presented assuming a single goal, the 
results can be trivially shown to extend to such scenarios. The 
trajectories however, will be significantly different if we insist 
on having a particular heading at the goal (See Figure [T8l. 

In the simulations for the rectangular model, we deliberately 
assume that any neighbor that cannot be reached via a con- 
trolled move is not reachable by an uncontrollable transition 
as well. Although this is not what we expect to encounter in 
field, the purpose of this assumption is to bring out an interest- 
ing consequence that we illustrate in Figures [T9la-b). As ex- 
plained above, this assumption implies that we have little or no 
uncertainty in local heading estimations (since the robot can- 
not turn in place, so there is no uncontrollable transition that 
alters heading in place). It therefore follows, that under this 
scenario, the platform would find it relatively safe to navigate 
narrow passages. This is exactly what we see in Figure [T9l b). 
where the circular robot with same coefficient of dynamic un- 
certainty, really goes out of way to avoid the narrow passage, 
while the rectangular robot goes through. 

7.4. Simulation Results for Mazes 

We simulate planning in a maze of randomly placed static 
obstacles. A sample case with optimal paths computed for dif- 
ferent coefficients of dynamic deviation is illustrated in Fig- 
ure |20T a). A key point to note is that the optimal path is length- 
ens first as y(&^°?) is decreased, and then starts shortening 
again, which may seem paradoxical at first sight. However, 
this is exactly what we expect. Recall that the proposed al- 
gorithm minimizes the probability of collision. Also note that 



there are two opposing effects in play here; while a longer path 
that stays away from the obstacles influences to decrease the 
collision probability, the very fact that the path is longer has an 
increasing influence arising from the increased probability that 
an uncontrollable sequence would execute from some point in 
the path that leads to a collision. At relatively high values of 
y(G^™ ) ), the first effect dominates and we can effectively de- 
crease the collision probability by staying away from the ob- 
stacles thereby increasing the path length. However, at low 
values of y(GjjJ^), the latter effect dominates, implying that 
increased path lengths are no longer advantageous. This in- 
teresting phenomenon is illustrated in Figure l20l'b). where we 
clearly see that the path lengths peak in the y(&^°®) = 0.72 
to y(*Ej^ > ) = 0.85 range (for the maze considered in Fig- 
ure |20fa)). Also note that the configuration space has to be 
sufficiently complex to actually see this effect; which is why 
we do not see this phenomenon in the simulation results pre- 
sented in Figures [T4f a-c). 

7.5. Experimental Runs on SEGWAY RMP 200 

The proposed algorithm is validated on a SEGWAY RMP 
200 which is a two-wheeled robot with significant dynamic un- 
certainty. In particular, the inverted-pendulum dynamics pre- 
vents the platform from halting instantaneously, and making 
sharp turns at higher velocities. At low velocities, however, the 
platform can make zero radius turns. The global positional fix 
is provided via an (in-house developed) over-head multi-camera 
vision system, which identifies the position and orientation of 
the robot in the testbed. The vision system yields a positional 
accuracy of +7.5 cm, and a heading accuracy or +0.1 rad for 
a stationary robot. The accuracy deteriorates significantly for a 
mobile target, but noise correction is intentionally not applied 
to simulate a high noise uncertain work environment. Further- 
more, the cameras communicate over a shared wireless network 
and randomly suffers from communication delays from time to 
time, leading to delayed positional updates to the platform. In 
the experimental runs conducted at NRSL the workspace dis- 
cretized into a 53 x 29 grid. Each grid location is about 4 sq. 
ft. allowing the SEGWAY to fit completely inside each such 
discretized positional state which justifies the simplified circu- 
lar robot modeling. The runs are illustrated in Figure [22] The 
robot was run at various allowed top speeds (v max ) ranging from 
0.5 mph to over 2.25 mph. Only the extreme cases are illus- 
trated in the figure. For each speed, the uncertainty parameters 
were estimated using the formulation presented in Section 
The sequence of computational steps for the low velocity case 
(v max = 0.5 mph) are shown in Figure [13] Note the coefficient 
of dynamic deviation for the low velocity case turns out to be 
y low = 0.973. For the high velocity case, (v max = 2.25 mph), the 
coefficient is computed to have a value of y'"*' 1 = 0.93 (calcula- 
tion not shown). Also, the robot is equipped with an on-board 
low-level reactive collision avoidance algorithm, which ensures 
that the platform does not actually collide due path deviations; 
but executes local reactive corrections when faced with such sit- 
uations. The platform is equipped with multiple high frequency 
sonars, infra-red range finders and a high-precision SICK LMS 
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Figure 17: Comparative trajectories for rectangular model 



200 laser range finder. The data from these multiple ranging de- 
vices, mounted at various key locations on the platform, must 
be fused to obtain correct situational awareness. In this paper, 
we skip the details of this on-board information processing for 
the sake of brevity. The overall scheme is illustrated in Fig- 
ure ED 

In the experimental runs, we choose two waypoints (marked 
A and E) in Figure [22] (plates a,b,d,e), and the mission is to 
plan and execute the optimal routes in sequence from A to B, 
back to A and repeat the sequence a specified number of times 
(thirty). This particular mission is executed for each top speed 
for a range of y values, namely with y e [0.75, 1] with incre- 



ments of 0.1. The expectation is that using the correct coef- 
ficient of dynamic deviation (as computed above for the two 
chosen speeds) for the given speed, would result in the mini- 
mum number of local corrections, leading to minimum average 
traversal times over thirty laps. 

The results are summarized in plates (c) (for the low ve- 
locity case) and (f) (for the high velocity case) in Figure [22] 
Note the fitted curve in both cases attain the minimal point very 
close to the corresponding computed y values, namely, 0.97 for 
Vmax = 0.5 mph and 0.92 for v max = 2.25 mph. A visual com- 
parison of the trajectories in the plates (a) and (d) clearly re- 
veal that the path execution has significantly more uncertainties 
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Figure 18: Comparative trajectories for rectangular model for 7(G{JJ^ > ) = 0.8: Case (a) we demand that the robot reach the goal with a specified heading (-150°). 
Case (b): Any heading at the goal is acceptable 
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Figure 19: Comparative trajectories for rectangular model and circular model to illustrate the effect of different uncertainty assumptions: (a) Trajectories in the 
absence of uncertainty (b) Trajectories with 7(GjfP^) = 0.8. 



in the high velocity case. Also note, that the higher average 
speed leads to repeated loss of position fix information in lo- 
cations around (row = 20, column = 35). Plates (b) and (e) 
illustrate the sequence of waypoints invoked by the robot in the 
two cases, being the centers of the states in the navigation au- 
tomaton that the robot visits during mission execution. Note 
that in the high velocity case, the variance of the trajectories is 
higher leading to a larger set of waypoints been invoked. Note 
that three distinct zones (denoted as Zone A, Zone B and Zone 
C) can be identified in the plates (e) and (f) of Figure [22] Zone 
A reflects the operation when y is (incorrectly assumed to be) 
too large, leading to too many corrections, and hence execution 
time can be reduced by reducing y. In Zone B, reducing y in- 
creases execution time, since now the trajectories becomes un- 
necessarily safe, i.e. stays away from obstacles way more than 
necessary leading to longer than required paths and hence in- 
creased execution time. Zone C represents a sort of saturation 
zone where reducing y has no significant effect, arising from the 
fact that the paths cannot be made arbitrarily safe by increasing 



path lengths. Although the experimental runs were not done for 
smaller values of y, we can say from the experience with maze 
simulations (See Section l74l >. that the execution times will start 
reducing again as y is further reduced. 

These results clearly show that the approach presented in this 
paper successfully integrates amortized dynamical uncertainty 
with autonomous planning, and establishes a computationally 
efficient framework to cyber-physical motion planning. 

8. Summary & Future Research 

The recently proposed PFSA-based path planning algorithm 
v* is generalized to handle amortized dynamic uncertainties in 
plan execution, arising from the physical limitations of sensing 
and actuation, and the inherent dynamic response of the physi- 
cal platforms. The key to this generalization is the introduction 
of uncontrollable transitions in the modified navigation automa- 
ton, and showing that v* can be implemented in a recursive 
fashion to guarantee plan optimality under such circumstances. 
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Figure 20: Effect of dynamic uncertainty on the optimal path lengths computed by v*. Plate (a) illustrates the optimal paths for yCGjJJ^ 5 ) = 1.0,0.8,0.4,0.1 from 
the start location marked by S and the goal marked by G. Plate (b) illustrates the variation of the length of the optimal paths as a function of 7(G{^ > ) for the maze 
illustrated in (a). 




Figure 21: Autonomous navigation scheme with v* implementation on heavily 
instrumented Segway RMP (shown in inset). 



The theoretical algorithmic results is verified in detailed high- 
fidelity simulations and subsequently validated in experimental 
runs on the SEGWAY RMP 200 at NRSL, Pennstate. 

8.1. Future Work 

Future work will extend the language-measure theoretic 
planning algorithm to address the following problems: 

1 . Multi-robot coordinated planning: Run-time complex- 
ity grows exponentially with the number of agents if one 
attempts to solve the full Cartesian product problem. How- 
ever v* can be potentially used to plan individually fol- 
lowed by an intelligent assembly of the plans to take inter- 
action into account. 

2. Hierarchical implementation to handle very large 
workspaces: Large workspaces can be solved more effi- 
ciently if planning is done when needed rather than solving 



the whole problem at once; however care must be taken to 
ensure that the computed solution is not too far from the 
optimal one. One the areas of current research is an al- 
gorithmic decomposition of the configuration space such 
that individual blocks are solved in parallel on communi- 
cating processors, with the interprocessor communication 
ensuring close-to-global optimality. We envision such an 
approach to be ideaally suited to scenarios involving multi- 
ple agents distributed over a large workspace which coop- 
eratively solve the global planning problem in an efficient 
resource-constrained manner. 
3. Handling partially observable dynamic events: In this 
paper all uncontrollable transitions are assumed to be per- 
fectly observable. Physical errors and onboard sensor fail- 
ures may need to be modeled as unobservable transitions 
and will be addressed in future publications. A generaliza- 
tion of the measure-theoretic optimization technique under 
partial observation has been already reported B48I1 . The fu- 
ture goal in this direction is to incorporate the modifica- 
tions to allow v* handle loss of observation and feedback 
information. 
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